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A CLASS  OF  OPTIMAL-ORDER  ZERO-FINDING  METHODS 
USING  DERIVATIVE  EVALUATIONS 

Richard  P.  Brent 

Computer  Centre, 

Australian  National  University, 

Canberra,  A.C.T.  2600,  Australia 

1.  INTRODUCTION 

It  is  often  necessary  to  find  an  approximation  to  a 
simple  zero  £ of  a function  f , using  evaluations  of  f and 
f'  . In  this  paper  we  consider  some  methods  which  are 
efficient  if  f'  is  easier  to  evaluate  than  f . Examples  of 
such  functions  are  given  in  Sections  5 and  6. 

The  methods  considered  are  stationary,  multipoint,  iter- 
ative methods,  "without  memory"  in  the  sense  of  Traub  [64], 
Thus,  it  is  sufficient  to  describe  how  a new  approximation 
(Xj)  is  obtained  from  an  old  approximation  (xQ)  to  C . 
Since  we  are  interested  in  the  order  of  convergence  of  differ- 
ent methods,  we  assume  that  f is  sufficiently  smooth  near 
C , and  that  Xq  is  sufficiently  close  to  C . Our  main 
result  is: 

Theorem  1.1 

There  exist  methods,  of  order  2v  , which  use  one  evalu- 
ation of  f and  v evaluations  of  f’  for  each  iteration. 

By  a result  of  Meersman  and  Wozniakowski,  the  order  2v 
is  the  highest  possible  for  a wide  class  of  methods  using  the 
same  information  (i.e.,  the  same  number  of  evaluations  of  f 
and  f ’ per  iteration) : see  Meersman  [75] . The  "obvious" 
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interpolator/  methods  have  order  v + 1 , but  the  optimal  or- 
der 2v  may  be  obtained  by  evaluating  f'  at  the  correct 
points.  These  points  are  determined  by  some  properties  of 
orthogonal  and  "almost  orthogonal"  polynomials. 

If  v + 1 evaluations  of  f are  used,  Instead  of  one  func- 
tion evaluation  and  v derivative  evaluations,  then  the  opti- 
mal order  is  2V  for  methods  without  memory  (Kung  and  Traub 
[73,74],  Wozniakowski  [75a, b]),  and  2 ^ for  methods  with 
memory  (Brent,  Winograd  and  Wolfe  [73]).  Thus,  our  methods 
are  only  likely  to  be  useful  for  small  v or  if  f'  is  much 
cheaper  than  f. 

Special  Cases 

Our  methods  for  v ^ 3 appear  to  be  new.  The  cases  v “ 1 
(Newton'  3 method)  and  \>  ■ 2 (a  fourth-order  method  of  Jarratt 
[69])  are  well  known.  Our  sixth-order  method  (with  v “ 3) 
improves  on  a fifth-order  method  of  Jarratt  [70], 

Generalizations 

Generalizations  to  methods  using  higher  derivatives  are 
possible.  One  result  is: 

Theorem  1.2 

For  m > 0 , n H , and  k satisfying  m + 1 £ k > 0, 
there  exist  methods  which,  for  each  iteration,  use  one  evalu- 
ation of  f ,f ' , . . . ,f  ^ , followed  by  n evaluations  of  f^, 
and  have  order  of  convergence  m + 2n  + 1 . 

The  methods  described  here  are  special  cases  of  the 
methods  of  Theorem  1.2  (take  k = m = 1 , and  v = n + 1)  . 
Since  proof  of  Theorem  1.2  is  given  in  Brent  [75],  we  omit 
proofs  here,  and  adopt  an  informal  style  of  presentation. 

Other  possible  generalizations  are  mentioned  in  Section  7. 


}) 


where 


XN  * X0  - £0/f0 

is  the  approximation  given  by  Newton's  method,  and 

4 ■ lty£ol  ■ l*N  * xol 
Now  - 


CN  “ xi  * Of*5  ) * 


and 


f’(x)  - Q'(x)  = 0(6  ) 
for  x near  Xj^  , so 

|f(Xj)|  * |f(x,)  - Q(Xj) | 

s IfcV  - q(xn)|  . |f'(o  - Q’rol-I^  - *! 

Thus 


for  some  £ between  Xj^  and  x^ 

c4 


|f(Xj)|  = 0(6’)  ♦ 0(62-62)  » 0(64)  , 


and 


Xj  - ? - OtlfUj)!)  • 0(64)  - 0(|x0  - d4) 


3.  A SIXTH-ORDER  METHOD 


To  obtain  a sixth-order  method  using  one  more  derivative 
evaluation  than  the  fourth-order  method  described  above,  we 
need  distinct,  nonzero  parameters,  and  a2  , such  that 

P(0)  = P’ CO)  = P' (ax)  = P'(a2)  = 0 
implies  P(l)  * 0 , for  all  fifth-degree  polynomials 


P(x)  * a + bx  + ...  + fx  . 
Thus,  we  want  the  conditions 


20^0  + . . . + 5ajf  = 0 


and 

to  imply 


2a2c  + . . . + 5a2f  = 0 


c + ...  + f = 0 . 


ipppoppp  i > j 1 * j i * * *m  mmmmmm  mmmm  h'wiw 


Equivalently,  we  want 


rank 


i.e. , 


rank 


2«1 

3a2 

4“i 

5a* 

2<X2 

3a2 

5a* 

- 2 , 

1 

1 

1 

1 

1 

al 

o 

ai 

°f 

1 

a2 

4 

“2 

= 2 , 

1/2 

1/3 

1/4 

l/5_ 

and 

w2  , 

ai  * 

W2a2  = 

1/ (i 

♦ 2) 

(3.1) 
for  0 $ i $ 3 . 

1 i 

Since  1/ (i  + 2)  = / x *xdx  , we  see  from  (3.1)  that 

and  a,  should  be  chose&  as  the  zeros  of  the  Jacobi  poly- 
1 2 

nomial,  G2(2,  2,  x)  = x - 6x/5  + 3/10  , which  is  orthogonal 
to  lower  degree  polynomials,  with  respect  to  the  weight  func- 
tion x , on  [0,  1]  . 

Let  = x0  " aiVf0  * XN  = x0  “ f0^£0  * 6 = 

and  let  Q(x)  be  the  cubic  polynomial  such  that 


and 

for  i=l,2.  Then 


■ f0  , Q'  (x0)  * i'0  , 
Q'CXi)  - f ' CYi) 
f(x)  - Q00  - 0(64) 


for  x between  Xq  and  x^  , but 


f(xN)  - Q(xn)  = 0(6°)  , 


because  of  our  choice  of  oij  and  a2  as  zeros  of  G2(2,  2,x) 


inTitiiiiiiiUHi 


6. 


(This  might  be  called  "superconvergence" : see  de  Boor  and 

Swartz  [73].) 

A Problem 

Since 

- Xj  = 0 (62) 

and 

f'(x)  - Q*(x)  - 0(i3) 

for  x near  Xj^  , proceeding  as  above  gives 

I f (Xj) | « 0(66)  . 0(63-62)  . 0(65)  , 

so  the  method  is  only  of  order  five,  not  six. 

A Solution 

After  evaluating  f'(y.)  , we  can  find  an  approximation 

j * 

x^  = C + 0(6  ) which  is  (in  general)  a better  approximation 
to  £ than  is  x^  . From  the  above  discussion,  we  can  get  a 
sixth-order  method  if  we  can  ensure  superconvergence  at  x^ 
rather  than  x^  . Define  by 

®1^*N  " xQ)  = - *0)  • 

In  evaluating  f'  at  = Xq  + 5^  (x^  - xQ),  we  effectively 
used  ctj  = + 0(6)  instead  of  a1  , so  we  must  perturb  a2 

to  compensate  for  the  perturbation  in  . 

From  (3.1),  we  want  such  that,  for  some  Wj  and 


(3.2)  WjSj  + w25  2 = Ui  + 2) 

for  0 < i < 2 . Thus 


7 


which  gives 

52  - (3  - 4ax)/(4  - 65^  = a2  + 0(6)  . 


Since 


Jr  = Wj  + 0(6) 

for  j*l,2,  we  have 

(3.3)  w^  + w252  = 1/5  + 0(6)  . 


(Compare  (3.1)  with  i ■ 3.)  If  we  evaluate  f'  at 
y2  * XQ  + " V * an<*  let  X1  k®  a sufficiently  good 

approximation  to  the  appropriate  zero  of  the  cubic  which  fits 
the  data  obtained  from  the  f and  f'  evaluations,  then 
(3.2)  and  (3.3)  are  sufficient  to  ensure  that  the  method  has 
order  six  after  all. 

4.  METHODS  OF  ORDER  2v 


In  this  section  we  describe  a class  of  methods  satisfying 
Theorem  1.1.  The  special  cases  V = 2 and  v = 3 have  been 
given  above. 


It  is  convenient  to  define  n = v - 1 . The  Jacobi  poly- 
nomial Gn(2,  2,  x)  is  the  monic  polynomial,  of  degree  n , 
which  is  orthogonal  to  all  polynomials  of  degree  n - 1 , with 
respect  to  the  weight  function  x , on  [0,  1].  Let  ai»***>an 
denote  the  zeros  of  Gn(2,  2,  x)  in  any  fixed  order.  We  des- 
cribe a class  of  methods  of  order  2(n  + 1)  , using  evaluations 
of  f(xQ)  , f'(x0)  , and  f'  (y1), . . . ,f'  (yn)  , where  the 

points  yr...,yn  are  determined  during  the  iteration. 

The  Methods 


1.  Evaluate  fQ  = f(xQ)  and  f^  = f * CXQ)  • 

2.  If  fg  = 0 set  Xj  = xQ  and  stop,  else  set  6 = |fg/f^j. 


3. 


For  i=l,...,n  do  steps  4 to  7. 


4.  Let  be  the  polynomial,  of  minimal  degree,  agree- 
ing with  the  data  obtained  so  far.  Let  z^  be  an 

approximate  zero  of  p.  , satisfying  z.  = xn  + 0(6) 

i+2  1 i u 

and  pi(z^)  =0(6  ) . (Any  suitable  method,  e.g. 

Newton’s  method, may  be  used  to  find  z^  .) 

5.  Compute  a.^  = j C2^  - Xq)/^  " V for 

j=l,...,i-l.  (Skip  if  i = 1.) 


6.  Let  q.  be  the  monic  polynomial,  of  degree 


7. 


8.  Let 


n + 1 - i , such  that 


P(x)  q.(x) 


fi-1 

n (x-a.  .) 

Li =1  1»j 


6 1 Lj*r 

= 0 for  all  polynomials  P of  degree  n 
(The  existence  and  uniqueness  of  q^ 
constructively:  see  Brent  [75].)  Let  a 


xdx 


approximate  zero  of  q.  , satisfying 

fi+1,1 


be  an 

“m*  v°'6> 


and  qi(ai^)  = 0(6  ) . 

Evaluate  f(yp  , where 

Xi  = x0  ♦ a._.(z.  - x0)  . 

Pn+1  be  as  at  step  4,  and  x^  an  approximate  zero 

of  pntl  ’ satisfyin8  xi  = xo  + and  Pn+l(xP  = 
0(62n+5)  . 


Asymptotic  Error  Constants 

The  asymptotic  error  constant  of  a stationary  zero- 
finding method  is  defined  to  be 

K = x0+C  (xl  ‘ ^/(x0  " * 

where  p is  the  order  of  convergence.  (Since  p is  an 
integer  for  all  methods  considered  here,  we  allow  K to  be 
signed.)  Let  be  the  asymptotic  error  constant  of  the 
methods  (of  order  2v)  described  above.  The  general  form  of 
is  not  known,  but  we  have 


where 


h * *2  • 

K2  = V9  - *2*3  * 

K3  = 4>6/100  + (1  - 50^2^/10  + (30^  - 2)<f>34>4/5  , 

K4  = |34>s  - 21«*>24»7/(1  - ap  + 9 [35  (1  - a3)-3/(l  - a Jit  fa 
- 25(9  - 44a3  + 42aj)4>44>5}/3675  , 

* - p£)- . 

dTf'S) 


5.  RELATED  NONLINEAR  RUNGE-KUTTA  METHODS 

The  ordinary  differential  equation 

(5.1)  dx/dt  = g(x)  , x(tQ)  = xQ  , 

may  be  solved  by  quadrature  and  zero-finding:  to  find 

x(tQ  + h)  we  need  to  find  a zero  of 

* / jtHv  - h . 


Note  that  f(xQ)  * - h is  known,  and  f'(x)  = l/g(x)  may  be 
evaluated  almost  as  easily  as  g(x)  . Thus,  the  zero-finding 
methods  of  Section  4 may  be  used  to  estimate  x(tg  + h)  , then 
x(tp  + 2h)  , etc.  When  written  in  terms  of  g rather  than  f, 
the  methods  are  seen  to  be  similar  to  Runge-Kutta  methods. 

For  example,  the  fourth-order  zero-finding  methods  of 
Section  2 (with  x^  an  exact  zero  of  the  quadratic  Q(x)  ) 
gives: 

?0  • *(*„)  . 

A • hgQ  , 


gj  = g(x0  + 2A/3) 


and 

(5.2)  Xj  = xQ  + 2A/U  * (3g0/gl  - 2)^)  . 

Note  that  (5.1)  is  nonlinear  in  and  gj  , unlike  the 
usual  Runge-Kutta  methods.  (This  makes  it  difficult  to 
generalize  our  methods  to  systems  of  differential  equations.) 
Since  the  zero-finding  method  is  fourth-order,  x1  = x(tn  + h) 

+ 0(h  ) , so  our  nonlinear  Runge-Kutta  method  has  order  three 
by  the  usual  definition  of  order  (Henrici  [62]). 

Similarly,  any  of  the  zero-finding  methods  of  Section  4 
have  a corresponding  nonlinear  Runge-Kutta  method.  Thus,  we 
have: 

Theorem  5.1 

If  v > 0 , there  is  an  explicit,  nonlinear,  Runge-Kutta 
method  of  order  2v  - 1 , using  v evaluations  of  g per 
iteration,  for  single  differential  equations  of  the  form  (5.1). 

By  the  result  of  Meersman  and  Wozniakowski,  mentioned  in 
Section  1,  the  order  2v  - 1 in  Theorem  5.1  is  the  best  poss- 
ible. Butcher  [65]  has  shown  that  the  order  of  linear  Runge- 
Kutta  methods,  using  v evaluations  of  g per  iteration,  is 
at  most  v , which  is  less  than  the  order  of  our  methods  if 
v > 1 (though  the  linear  methods  may  also  be  used  for  systems 
of  differential  equations) . 

6.  SOME  NUMERICAL  RESULTS 

In  this  section  we  give  some  numerical  results  obtained 
with  the  nonlinear  Runge-Kutta  methods  of  Section  5.  Consider 
the  differential  equation  (5.1)  with 

(6.1)  g(x)  = (27T)Jsexp(x2/2) 

and  x(0)  = 0 . Using  step  sizes  h = 0.1  and  0.01,  we 

estimated  x(0.4)  , obtaining  a computed  value  x^  . The 


error  was  defined  by 

e.  * (2tt)_Js  j1  exp(-u2/2)du  - 0.4  . 

0 

All  computations  were  performed  on  a Univac  1108  computer, 
with  a floating-point  fraction  of  60  bits.  The  results  are 
summarized  in  Table  6.1.  The  first  three  methods  are  derived 
from  the  zero-finding  methods  of  Section  4 (with  v = 2,  3 and 
4 respectively).  Method  RK4  is  the  classical  fourth-order 
Rungt-Kutta  method  of  Kutta  [01] , and  method  RK7  is  a seventh- 
order  method  of  Shanks  [66] . 


Table  6.1:  Comparison  of  Runge-Kutta  Methods 


Method 

g evaluations 
per  iteration 

Order 

eo.i 

a 

o 

o 

i— » 

Sec.  4 

2 

3 

-9.45'-6 

1.49'-7 

Sec.  4 

3 

5 

3.16’-6 

1 

r- 

• 

(N 

1 

Sec.  4 

4 

7 

3.86’-8 

3.69'-15 

RK4 

4 

4 

1.95* -5 

7. 90' -9 

RK7 

9 

7 

-5.19’-7 

— 1 . 67  * -13 

More  extensive  numerical  results  are  given  in  Brent  [75] . 
Note  that  the  differential  equation  (6.1)  was  chosen  only  for 
illustrative  purposes:  there  are  several  other  ways  of 

computing  quantiles  of  the  normal  distribution.  A practical 
application  of  our  methods  (computing  quantiles  of  the  incom- 
plete Gamma  and  other  distributions)  is  described  in  Brent 
[76]. 

7.  OTHER  ZERO-FINDING  METHODS 

In  Section  1 we  stated  some  generalizations  of  our 
methods  (see  Theorem  1.2).  Further  generalizations  are  des- 
cribed in  Meersman  [75] . Kacewicz  [75]  has  considered  methods 
which  use  information  about  an  integral  of  f instead  of  a 
derivative  of  f . 


L 


"Sporadic”  methods  using  derivatives  may  be  derived  as  in 
Sections  2 and  3.  For  example,  is  there  an  eighth-order 
method  which  uses  evaluations  of  f , f'  , f”  , and  f"'  at 


Xq  , followed  by  evaluations  of  f'  , f"  and  f"'  at  some 


point  y^  ? Proceeding  as  in  Sections  2 and  3 , we  need  a 
nonzero  a satisfying 


rank 


1 

1 

1 

1 

4 

5a 

6a2 

7a' 

12 

20a 

30a2 

42a' 

24 

60a 

120a2 

210a' 

= 3 , 


which  reduces  to 
(7.1) 


35a3  - 84a2  + 70a  - 20  = 0 . 


Since  (7.1)  has  one  real  root,  a = 0.7449...,  an  eighth-order 
method  does  exist.  It  is  interesting  to  note  that  (7.1)  is 
equivalent  to  the  condition 

.3 , 


1 * 

/ xix  - a)Jdx  = 0 . 
0 


As  a final  example,  we  consider  sixth-order  methods 
using  f(xQ)  , f’(Xg)  , f,,(y1)  , and  f'"(y2)  . (These 

could  be  called  Abel-Goncarov  methods.)  Proceeding  as 

above,  we  need  a^  and  a2  such  that 


rank 


2 

0 

1 


6a, 


12al 


6 

1 


24a, 


20a 

60a 

1 


= 2 


which  gives 
(7.2) 


60a*  - 80a3  + 60a2  - 24ax  +3  = 0 


mmm mummu 


13. 


and  2 

a2  = (1  - 6ap/ (4  - 12^)  . 

Fortunately,  (7.2)  has  two  real  roots,  = 0.2074...  and 
ot ^ = 0.5351...  Choosing  one  of  these,  we  may  evaluate  f(xQ), 
f ' (Xq)  and  f'(y^)  , where  is  defined  as  in  Section  3. 

We  may  then  fit  a quadratic  to  the  data,  compute  the  perturbed 
6^  , and  take 

a2  - (1  - 6oJ)/(4  - 125,)  , 

etc.,  as  in  Section  3.  It  is  not  known  whether  this  method 
can  be  generalized,  i.e.,  whether  real  methods  of  order  2n  , 
using  evaluations  of  f(xQ)  , f * (xQ)  , f"^)  , ...,  f^(ynl), 
exist  for  all  positive  n . 
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